Not all operators are available through the query expression
syntax built into the C# compiler, and to use the remaining operators
(or to call your own operators), the extension method query syntax or a
combination of the two is necessary. You will continually need to know
both styles of query syntax in order to read, write, and understand code
written using LINQ.
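As a rough sketch of why both styles are needed, the following example filters and orders with a query expression and then continues with two operators that have no query expression keyword: the standard Distinct operator and a custom operator. The EveryOther operator and the class names are hypothetical, introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SampleOperators
{
    // Hypothetical custom operator: yields every second element,
    // starting with the first.
    public static IEnumerable<T> EveryOther<T>(this IEnumerable<T> source)
    {
        bool take = true;
        foreach (T item in source)
        {
            if (take) yield return item;
            take = !take;
        }
    }
}

public static class MixedSyntaxDemo
{
    public static List<int> Run()
    {
        int[] nums = { 0, 4, 2, 6, 3, 8, 3, 1 };

        // Query expression for the parts that have keywords, then
        // extension methods for the operators that do not.
        var query = (from n in nums
                     where n < 5
                     orderby n
                     select n)
                    .Distinct()      // standard operator without a keyword
                    .EveryOther();   // custom operator defined above

        return query.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "0, 2, 4"
    }
}
```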
Extension method format
(also known as the dot notation syntax)—The extension method format is
simply a chain of extension method calls, each
returning an IEnumerable<T> result that the next extension method operates on (this style is known as a fluent interface).
    int[] nums = new int[] {0,4,2,6,3,8,3,1};
    var result1 = nums.Where(n => n < 5).OrderBy(n => n);
    // or with line-breaks added for clarity
    var result2 = nums
        .Where(n => n < 5)
        .OrderBy(n => n);
Query Expression format
(preferred, especially for joins and groups)—Although not all standard
query operators are supported by the query expression syntax, those
that are supported read far more clearly in it. The query
expression syntax is much gentler than the extension method syntax in
that it simplifies the syntax by removing lambda expressions and by
introducing a familiar SQL-like representation.
    int[] nums = new int[] {0,4,2,6,3,8,3,1};
    var result = from n in nums
                 where n < 5
                 orderby n
                 select n;
Query Dot syntax
(a combination of the two formats)—This format combines a query
expression syntax query surrounded by parentheses, followed by more
operators using the Dot Notation syntax. As long as the query expression
returns an IEnumerable<T>, it can be followed by an extension method chain.
    int[] nums = new int[] {0,4,2,6,3,8,3,1};
    var result = (from n in nums
                  where n < 5
                  orderby n
                  select n).Distinct();
Each query syntax has its own merits and pitfalls, which the following sections cover in detail.
Query Expression Syntax
The query expression syntax
provided in C# 3.0 and later versions makes queries clearer and more
concise. The compiler converts the query expression into extension
method syntax during compilation, so the choice of which syntax to use
is based solely on code readability.
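To make the translation concrete, here is a minimal sketch that writes the same query both ways and checks that the two produce the same sequence. The second query is approximately what the compiler generates for the first. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TranslationDemo
{
    static readonly int[] Nums = { 0, 4, 2, 6, 3, 8, 3, 1 };

    public static IEnumerable<int> ViaExpression()
    {
        return from n in Nums
               where n < 5
               orderby n descending
               select n * 10;
    }

    // Roughly what the compiler produces for the query expression above:
    // each clause becomes a call to the matching extension method.
    public static IEnumerable<int> ViaMethods()
    {
        return Nums
            .Where(n => n < 5)
            .OrderByDescending(n => n)
            .Select(n => n * 10);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ViaExpression())); // prints "40, 30, 30, 20, 10, 0"
        Console.WriteLine(ViaExpression().SequenceEqual(ViaMethods())); // prints "True"
    }
}
```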
Figure 1 shows the basic form of query expressions built into C# 3.0.
Note
The
fact that the order of the keywords is different in SQL is unfortunate
for those who are SQL masters; however, one very compelling reason for
the difference was to improve the developer experience. The
From-Where-Select order allows the editing environment (Visual Studio in
this case) to provide full IntelliSense support when writing the query.
Once you have written the from clause, the editor knows the element type, so its properties appear as you write the where
clause. This wouldn’t be the case (and isn’t in SQL Server’s query
editing tools) if the C# designers followed the more familiar
Select-From-Where keyword ordering.
Most of the query
expression syntax needs no explanation for developers experienced with
other query syntax, like SQL. Although the order is different than in
traditional query languages, each keyword name gives a strong indication
as to its function, the exception being the let and into clauses, which are described next.
Let—Create a Local Variable
Queries can often be
written with less code duplication by creating a local variable to hold
the value of an intermediate calculation or the result of a subquery.
The let keyword enables you to keep the
result of an expression (a value or a subquery) in scope throughout the
rest of the query expression being written. Once assigned, the variable
cannot be reassigned with another value.
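In the following code, a local variable called average is assigned
the average value of the entire source sequence and is then used in the select projection for each element. Note that the let expression is evaluated once per element rather than once per query, so for a large source it is cheaper to compute the average in an ordinary variable before the query.
The let form is shown here to illustrate the syntax: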
    var variance = from element in source
                   let average = source.Average()
                   select Math.Pow((element - average), 2);
The let keyword is implemented purely by the compiler, which creates an anonymous type that contains both the original range variable (element in the previous example) and the new let variable. The previous query maps directly to the following (compiler translated) extension method query:
    var variance =
        source.Select(element =>
                new
                {
                    element = element,
                    average = source.Average()
                })
            .Select(temp0 =>
                Math.Pow((double)temp0.element - temp0.average, 2));
Each additional let
variable introduced will cause the current anonymous type to be
cascaded within another anonymous type containing itself and the
additional variable—and so on. However, all of this magic is transparent
when writing a query expression.
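One consequence of this translation is that the let expression sits inside the Select lambda, so it runs for every element of the source. The following sketch counts the evaluations to show this, and also chains a second let variable. The CountedAverage helper and the class name are hypothetical, added only so that the calls can be counted.

```csharp
using System;
using System.Linq;

public static class LetEvaluationDemo
{
    public static int Evaluations;

    // Hypothetical helper: wraps Average() so each call can be counted.
    static double CountedAverage(int[] source)
    {
        Evaluations++;
        return source.Average();
    }

    public static double[] Run()
    {
        Evaluations = 0;
        int[] source = { 2, 4, 6 };

        var squaredDeviations =
            from element in source
            let average = CountedAverage(source)   // runs once per element
            let deviation = element - average      // a second let can use the first
            select deviation * deviation;

        return squaredDeviations.ToArray();
    }

    public static void Main()
    {
        double[] result = Run();
        Console.WriteLine(string.Join(", ", result));            // prints "4, 0, 4"
        Console.WriteLine("let evaluations: " + Evaluations);    // prints "let evaluations: 3"
    }
}
```

When the value does not depend on the range variable, assigning it to an ordinary local variable before the query avoids the repeated work.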
Into—Query Continuation
The group, join, and select
query expression keywords allow the resulting sequence to be captured
into a local variable and then used in the rest of the query. The into keyword allows a query to be continued by using the result stored into the local variable at any point after its definition.
As a quick preview, the following example
groups all elements of the same value and stores the result in a
variable called groups; by using the into keyword (in combination with the group keyword), the groups variable can participate and be accessed in the remaining query statement.
    var groupings = from element in source
                    group element by element into groups
                    select new {
                        Key = groups.Key,
                        Count = groups.Count()
                    };
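For comparison, the sketch below runs a grouping query of this shape against a small sample array and shows an equivalent written in extension method syntax, where the continuation variable simply becomes the lambda parameter of the operators that follow GroupBy. The sample data, the added orderby clause, and the class name are assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntoDemo
{
    static readonly int[] Source = { 3, 1, 3, 2, 1, 3 };

    public static List<string> ViaExpression()
    {
        var groupings =
            from element in Source
            group element by element into groups
            orderby groups.Key                      // groups is in scope from here on
            select new { Key = groups.Key, Count = groups.Count() };

        return groupings.Select(g => g.Key + " x" + g.Count).ToList();
    }

    public static List<string> ViaMethods()
    {
        // The same query in extension method syntax.
        var groupings = Source
            .GroupBy(element => element)
            .OrderBy(groups => groups.Key)
            .Select(groups => new { Key = groups.Key, Count = groups.Count() });

        return groupings.Select(g => g.Key + " x" + g.Count).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ViaExpression())); // prints "1 x2, 2 x1, 3 x3"
        Console.WriteLine(ViaExpression().SequenceEqual(ViaMethods())); // prints "True"
    }
}
```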
Comparing the Query Syntax Options
Listing 1 uses extension method syntax, and Listing 2 uses query expression syntax, but they are functionally equivalent, with both generating the identical result shown in Output 1.
The query expression version reads more clearly because it drops the
lambda expressions and uses SQL-style keywords instead, although for simple
queries like this one the gain in clarity is minimal.
Listing 1. Query gets all contacts in the state of “WA” ordered by last name
and then first name using extension method query syntax—see Output 1
    List<Contact> contacts = Contact.SampleData();
    var q = contacts
        .Where(c => c.State == "WA")
        .OrderBy(c => c.LastName)
        .ThenBy(c => c.FirstName);
    foreach (Contact c in q)
        Console.WriteLine("{0} {1}", c.FirstName, c.LastName);
Listing 2. The same query as in Listing 1 except using query expression syntax—see Output 1
    List<Contact> contacts = Contact.SampleData();
    var q = from c in contacts
            where c.State == "WA"
            orderby c.LastName, c.FirstName
            select c;
    foreach (Contact c in q)
        Console.WriteLine("{0} {1}", c.FirstName, c.LastName);
Output 1.
    Stewart Kagel
    Chance Lard
    Armando Valdes
There are
extensive code readability advantages to using the query expression
syntax over the extension method syntax when your query contains join
and/or group functionality. Although not all joining and grouping
functionality is natively available to you when using the query
expression syntax, the majority of queries you write will not require
those extra features. Listing 3
demonstrates the rather clumsy extension method syntax for Join (clumsy
in that it is not clear what each argument to the Join
extension method means just by reading the code). The functionally equivalent
query expression syntax for this same query is shown in Listing 4. Both queries produce the identical result, as shown in Output 2.
If it is not clear already, my personal preference is to use the query expression syntax whenever a Join or GroupBy
operation is required in a query. When a standard query operator isn’t
supported by the query expression syntax (as is the case for the Take method, for example), you can parenthesize the query and use extension method syntax from that point forward, as Listing 4 demonstrates.
Listing 3. Joins become particularly complex in extension method syntax. This
query returns the first five call-log details ordered by most
recent—see Output 2
    List<Contact> contacts = Contact.SampleData();
    List<CallLog> callLog = CallLog.SampleData();
    var q = callLog.Join(contacts,
                call => call.Number,
                contact => contact.Phone,
                (call, contact) => new
                {
                    contact.FirstName,
                    contact.LastName,
                    call.When,
                    call.Duration
                })
            .OrderByDescending(call => call.When)
            .Take(5);
    foreach (var call in q)
        Console.WriteLine("{0} - {1} {2} ({3}min)",
            call.When.ToString("ddMMM HH:m"),
            call.FirstName, call.LastName, call.Duration);
Listing 4. Query expression syntax of the query identical to that shown in Listing 3—see Output 2
    List<Contact> contacts = Contact.SampleData();
    List<CallLog> callLog = CallLog.SampleData();
    var q = (from call in callLog
             join contact in contacts on call.Number equals contact.Phone
             orderby call.When descending
             select new
             {
                 contact.FirstName,
                 contact.LastName,
                 call.When,
                 call.Duration
             }).Take(5);
    foreach (var call in q)
        Console.WriteLine("{0} - {1} {2} ({3}min)",
            call.When.ToString("ddMMM HH:m"),
            call.FirstName, call.LastName, call.Duration);
Output 2.
    07Aug 11:15 - Stewart Kagel (4min)
    07Aug 10:35 - Collin Zeeman (2min)
    07Aug 10:5 - Mack Kamph (1min)
    07Aug 09:23 - Ariel Hazelgrove (15min)
    07Aug 08:12 - Barney Gottshall (2min)
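The listings above rely on Contact and CallLog sample data that is not shown here. The following self-contained sketch substitutes small, hypothetical versions of those classes, with invented data, so that the two join styles can be run side by side. The class shapes and values are assumptions and are not the sample data behind Output 2.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Contact and CallLog classes.
public class Contact
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Phone { get; set; }
}

public class CallLog
{
    public string Number { get; set; }
    public DateTime When { get; set; }
    public int Duration { get; set; }
}

public static class JoinDemo
{
    static readonly List<Contact> Contacts = new List<Contact>
    {
        new Contact { FirstName = "Ann", LastName = "Lee", Phone = "555-0001" },
        new Contact { FirstName = "Bob", LastName = "Ray", Phone = "555-0002" }
    };

    static readonly List<CallLog> Calls = new List<CallLog>
    {
        new CallLog { Number = "555-0001", When = new DateTime(2024, 1, 1, 9, 0, 0), Duration = 4 },
        new CallLog { Number = "555-0002", When = new DateTime(2024, 1, 1, 10, 0, 0), Duration = 2 },
        new CallLog { Number = "555-0001", When = new DateTime(2024, 1, 1, 11, 0, 0), Duration = 7 },
        // No contact has this number, so the inner join drops this call.
        new CallLog { Number = "555-0003", When = new DateTime(2024, 1, 1, 12, 0, 0), Duration = 1 }
    };

    public static List<string> ViaMethods()
    {
        return Calls
            .Join(Contacts,
                call => call.Number,        // key taken from each call
                contact => contact.Phone,   // key taken from each contact
                (call, contact) => new { contact.FirstName, contact.LastName, call.When, call.Duration })
            .OrderByDescending(call => call.When)
            .Take(2)
            .Select(call => string.Format("{0} {1} ({2}min)", call.FirstName, call.LastName, call.Duration))
            .ToList();
    }

    public static List<string> ViaExpression()
    {
        var q = (from call in Calls
                 join contact in Contacts on call.Number equals contact.Phone
                 orderby call.When descending
                 select new { contact.FirstName, contact.LastName, call.When, call.Duration })
                .Take(2);

        return q.Select(call => string.Format("{0} {1} ({2}min)", call.FirstName, call.LastName, call.Duration))
                .ToList();
    }

    public static void Main()
    {
        foreach (string line in ViaExpression())
            Console.WriteLine(line);   // prints "Ann Lee (7min)" then "Bob Ray (2min)"
    }
}
```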
Express the most limiting query method first; this reduces the workload of the successive operators.
Split each operator onto a different line (including the period joiner). This allows you to comment out individual operators when debugging.
Be consistent: within an application, use the same style throughout.
To make queries easier to read, don’t be afraid to split a query into multiple parts and indent to show hierarchy.
If you need to mix extension methods with query expressions, put the extension methods at the end.
Keep each part of the query expression on a separate line so that you can comment out individual clauses when debugging.
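The layout guidelines above can be sketched as follows. Each operator or clause sits on its own line, the filter comes first, and one line in each query is commented out as it might be during a debugging session. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryLayoutDemo
{
    public static List<int> DotNotation(int[] nums)
    {
        var query = nums
            .Where(n => n < 5)      // most limiting operator first
            //.Distinct()           // temporarily disabled while debugging
            .OrderBy(n => n);

        return query.ToList();
    }

    public static List<int> QueryExpression(int[] nums)
    {
        var query = (from n in nums
                     where n < 5
                     //orderby n    // temporarily disabled while debugging
                     select n)
                    .Distinct();    // extension methods go at the end

        return query.ToList();
    }

    public static void Main()
    {
        int[] nums = { 0, 4, 2, 6, 3, 8, 3, 1 };
        Console.WriteLine(string.Join(", ", DotNotation(nums)));     // prints "0, 1, 2, 3, 3, 4"
        Console.WriteLine(string.Join(", ", QueryExpression(nums))); // prints "0, 4, 2, 3, 1"
    }
}
```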